Genetic endowments
and wealth inequality

Daniel Barth, Nicholas W. Papageorge, Kevin Thom
Journal of Political Economy, 2020, vol. 128, No.4

伊藤成朗

Barth, Papageorge, and Thom (2020)

SNP, polygenic score

Genome

Chromosome…DNA


  • Chromosomes: 1/2 each from both parents
  • DNA: Rolled up and shaped like an X
  • Gene: DNA segments
  • Genome: 3 billion nucleotides (letters)
    • Letter sequence = genome sequence
    • 1.5 billion base pairs
  • Base pairs differ between people in less than 1% (15 million) locations
  • Single-nucleotide polymorphism (SNP) = such location
    • 2 chromosomes per person: AT-AT, GC-GC, AT-GC, coded as 0, 2, 1 copies of the reference pair (reference = GC)
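
The 0/1/2 coding of a SNP can be sketched as a count of reference base pairs across the two chromosomes. This is a minimal illustration only: the function name `code_snp` and the `"AT-GC"` string format are assumptions made for the example, not a format used by the paper or the HRS.

```python
def code_snp(genotype: str, reference: str = "GC") -> int:
    """Count how many of the two chromosomes carry the reference base pair.

    `genotype` is a hypothetical string such as "AT-GC": one base pair
    per chromosome, separated by a hyphen.
    """
    pairs = genotype.split("-")
    if len(pairs) != 2:
        raise ValueError(f"expected two base pairs, got {genotype!r}")
    # True counts as 1, so the sum is the number of reference copies.
    return sum(pair == reference for pair in pairs)


genotypes = ["AT-AT", "GC-GC", "AT-GC"]
codes = [code_snp(g) for g in genotypes]
print(codes)  # [0, 2, 1]
```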

Twin studies

  • Estimate how much genetic factors collectively matter for explaining variations of a trait
    • Collectively: Unit = group level aggregation
    • Explaining variations: Variance decomposition
  • Do not reveal which SNP is correlated

Genome wide association studies (GWASs)

  • Collect \(J\) observable (?) SNPs of individual \(i\).

Regress outcome \(Y_{i}\) on each SNP using \(J\) estimating equations:

\[ Y_{i}=\boldsymbol{\mu}'\mathbf{x}_{i}+\beta_{j}SNP_{ij}+\epsilon_{ij}, \quad j=1,\dots,J. \]
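
A minimal sketch of the \(J\) separate regressions, assuming ordinary least squares and simulated data. The function name `gwas_betas`, the simulated SNPs, and the "true" coefficients are all hypothetical; real GWASs use dedicated software and far larger \(J\).

```python
import numpy as np


def gwas_betas(y, X, snps):
    """Estimate one coefficient per SNP, each from its own OLS regression.

    y:    (n,) outcome
    X:    (n, k) controls x_i, assumed here to include a constant column
    snps: (n, J) allele counts coded 0/1/2
    """
    n, J = snps.shape
    betas = np.empty(J)
    for j in range(J):
        # Regression j uses the controls plus SNP j only.
        design = np.column_stack([X, snps[:, j]])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        betas[j] = coef[-1]  # coefficient on SNP j
    return betas


# Simulated example (not data from the paper).
rng = np.random.default_rng(0)
n, J = 500, 5
snps = rng.integers(0, 3, size=(n, J)).astype(float)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
true_beta = np.array([0.5, 0.0, -0.3, 0.0, 0.2])
y = X @ np.array([1.0, 0.4]) + snps @ true_beta + rng.normal(scale=0.1, size=n)

beta_hat = gwas_betas(y, X, snps)
print(np.round(beta_hat, 2))
```

Because the simulated SNPs are drawn independently, each one-SNP regression recovers its own coefficient up to noise; with correlated SNPs the estimates would pick up neighbouring effects.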

Polygenic score of \(i\) for the outcome \(Y\) = “Educational attainment (EA) score”

\[ PGS_{i}=\sum_{j=1}^{J}\tilde{\beta}_{j} SNP_{ij} \]
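
The score is a weighted sum of allele counts, which can be sketched as follows. The weights and the genotype vector are made-up numbers for illustration, and `polygenic_score` is a hypothetical helper name.

```python
def polygenic_score(weights, allele_counts):
    """Weighted sum of allele counts: sum over j of weight_j * SNP_ij."""
    if len(weights) != len(allele_counts):
        raise ValueError("weights and allele_counts must have the same length")
    return sum(w * s for w, s in zip(weights, allele_counts))


# Hypothetical weights for J = 3 SNPs and one person's 0/1/2 codes.
weights = [0.02, -0.01, 0.03]
person = [0, 2, 1]
score = polygenic_score(weights, person)  # 0.02*0 - 0.01*2 + 0.03*1, about 0.01
print(round(score, 4))
```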

  • Use the Bayesian LDpred procedure to obtain weights \(\tilde{\beta}_{j}\) that correct the GWAS estimates for correlation across SNPs

  • Use all SNPs: Better out-of-sample results than using only SNPs with genome-wide significance (\(p\) value \(< 5\times 10^{-8}\), i.e. 0.000005%)

  • PGS is considered to be a predictor of individual fixed effects

    • HRS sample: covariance with EA score / total schooling years variance = 10.6%
    • Edu attainment SNPs \(\propto\) biological process of brain development, cognition (Okbay et al. 2016; Lee et al. 2018)
    • Edu attainment SNPs \(\propto\) cognition SNPs (Okbay et al. 2016)
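
One way to read the 10.6% figure, assuming it is an \(R^{2}\)-type share from a regression of schooling years \(Y_{i}\) on the EA score alone (the slide does not say whether other controls are included):

\[ R^{2}=\frac{\operatorname{Cov}(Y_{i}, PGS_{i})^{2}}{\operatorname{Var}(Y_{i})\,\operatorname{Var}(PGS_{i})} \]

If the score is standardized so that \(\operatorname{Var}(PGS_{i})=1\), this reduces to \(\operatorname{Cov}(Y_{i}, PGS_{i})^{2}/\operatorname{Var}(Y_{i})\).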

Interpretation

  • Correlation, not causality: Outcomes ⇐ gene + environment (endogenous)
  • Attenuation due to measurement errors in \(\hat{\beta}_{j}\)
  • Linearity assumption → likely underestimation of gene contributions (relative to twin studies)
  • Population stratification: Ethnic-group fixed effects ⇐ gene, environment
    • Including the first 10 principal components of the SNP data as covariates has been shown to control for geographic variation
  • External validity is limited by the ancestral composition of the HRS sample
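
The principal-component control for population stratification can be sketched as below. This assumes a plain SVD on the column-centered SNP matrix with simulated genotypes; the helper name `top_principal_components` is hypothetical, and applied work typically also standardizes each SNP and uses specialized software.

```python
import numpy as np


def top_principal_components(snps, k=10):
    """Return the first k principal-component scores of an (n, J) SNP matrix."""
    centered = snps - snps.mean(axis=0)
    # Columns of U scaled by the singular values are the PC scores.
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :k] * S[:k]


# Simulated genotypes (not HRS data).
rng = np.random.default_rng(1)
snps = rng.integers(0, 3, size=(200, 50)).astype(float)

pcs = top_principal_components(snps, k=10)
# The PCs enter the regression as extra covariates next to a constant.
controls = np.column_stack([np.ones(len(snps)), pcs])
print(controls.shape)  # (200, 11)
```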

Data

  1. Health and Retirement Study (HRS): Age > 50 + partners, genetic samples 2006, 2008
    • All households with at least one individual with genetic European ancestry
    • “Retired households” in 1996, 1998, 2002, 2004, 2006, 2008, 2010
    • Households with 1-2 members, drop same sex HHs
    • Household-year observations of both members 65-75 years old
    • Self-reported earnings
  2. Social Security Administration (SSA) data
    • Income data, top coded

2590 HHs, 5701 HH-year observations (Table 1)

our sample comprises households for whom wealth data are most likely to be both accurate and comprehensive.

  • All pension, annuity, social security income, defined-contribution retirement plans
  • Net value of housing, private businesses, vehicles, financial assets (cash, checking account, saving account, CD, stocks, mutual funds, trusts, other)
  • Winsorize at 1st and 99th percentiles of log real total wealth
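
Winsorizing log wealth at the 1st and 99th percentiles can be sketched as follows. The example assumes strictly positive wealth values so that the log is defined; how the paper treats zero or negative net wealth is not covered here, and `winsorize_log_wealth` is a hypothetical helper name.

```python
import numpy as np


def winsorize_log_wealth(wealth, lower=1, upper=99):
    """Take logs, then clip values below/above the given percentiles."""
    log_w = np.log(np.asarray(wealth, dtype=float))
    lo, hi = np.percentile(log_w, [lower, upper])
    # Values outside [lo, hi] are replaced by the nearest bound, not dropped.
    return np.clip(log_w, lo, hi)


# Made-up wealth values with one extreme observation at each end.
wealth = np.array([50.0] + [1e5] * 98 + [5e9])
log_wealth = winsorize_log_wealth(wealth)
print(len(log_wealth) == len(wealth))  # True: the sample size is unchanged
```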

Selection biases

  1. Into genotyping: Older, more educated, more females, wealthier (Table S1)
    • Education \(\propto\) genotyping ⇒ genotyped individuals with low EA scores have higher than average education ⇒ attenuates EA score \(\propto\) education gradient
  2. Of using retired households: More precise wealth but smaller sample size/more selection (Table 3)
    • Selection leaves more males and older members among high EA scorers ⇐ higher EA scorers (especially males) live longer; the differences are modest in size (Panel A)
    • Age band of retired HHs ↑ than main sample ⇒ q4-q1 differences (more selection) ↑ ⇒ EA score \(\propto\) education gradient ↑ (Panel A vs. B)
    • \(+\)non-retired HHs ⇒ q4-q1 differences ↓ (less selection) ⇒ EA score \(\propto\) education gradient ↓ (Panel A vs. C)
    • Smaller degree of selection leads to smaller q4-q1 differences
    • But larger measurement errors in wealth (\(\because\) pensions, etc. are not measured)

However, the magnitudes of these differences are similar and relatively modest across alternate samples. Restricting our sample to retired households balances concerns about sample selection and measurement error. (p.19)

Meaning:

  • We could have used a wider age band retired HH sample
    • More precision (larger sample size, smaller measurement errors)
    • Larger EA-edu gradient
  • But we did not
    • Feel free to accept our results!

Results

Fig 2A

EA score \(\propto\) wealth

  • EA score -1 to 1: \(+\) $20,000

Table 2 also shows EA score \(\propto\) wealth ($475K, q4-q1), lifetime labor income ($380K, q4-q1)

Fig 2B

EA score \(\propto\) wealth similar between high schoolers vs college grads, up to q3

Tab 4

EA score \(\propto\) wealth gradient: .246 (raw)→.070 (+edu)→.047 (+labor income)
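
As a rough worked reading of these coefficients, assuming they are comparable across specifications, the share of the raw gradient that survives each set of controls is

\[ \frac{.070}{.246}\approx 0.28, \qquad \frac{.047}{.246}\approx 0.19, \]

so education absorbs roughly 72% of the raw gradient, and education plus labor income about 81%.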

Gradient: “ability” + school quality ← income measurement error

Robust to:

  • \(+\)Non financial respondent EA scores
    • The NFR EA score is not partially correlated with wealth
  • \(+\)SSA income (more objective)
  • \(+\)Non-retired HHs
  • \(+\)Self-reported income of HRS

  • Changes in sample and definitions: HRS sampling weights, only 1 HH-year per sample, only coupled HHs, different age, complicated function of incomes, excluding retirement and housing wealth, different version of EA score
  • Adding: cognitive ability, number of children, death of HH member, years since retirement

Gene-environment correlations

Proxy of environment

  • Direct: Bequests
  • Indirect: Parental education

Does not change the gene-wealth gradient

  • Parental investments are captured by respondent’s schooling and labor incomes

Additional mechanisms

EA score ↑

  • One-year mortality ↓ for females
  • Expected longevity ↑ for females

Barth, Daniel, Nicholas W. Papageorge, and Kevin Thom. 2020. “Genetic Endowments and Wealth Inequality.” Journal of Political Economy 128 (4): 1474–1522. https://doi.org/10.1086/705415.
Lee, James J, Robbee Wedow, Aysu Okbay, Edward Kong, Omeed Maghzian, Meghan Zacher, Tuan Anh Nguyen-Viet, et al. 2018. “Gene Discovery and Polygenic Prediction from a Genome-Wide Association Study of Educational Attainment in 1.1 Million Individuals.” Nature Genetics 50 (8): 1112–21.
Okbay, Aysu, Jonathan P Beauchamp, Mark Alan Fontana, James J Lee, Tune H Pers, Cornelius A Rietveld, Patrick Turley, et al. 2016. “Genome-Wide Association Study Identifies 74 Loci Associated with Educational Attainment.” Nature 533 (7604): 539.